Abstract

A thermal resistance network is used to predict the performance of the 
Multi Chip Units (MCUs) in the VAX 9000™ computer. This branched net- 
work is comprised of resistors defined by analytical, numerical and ex- 
perimental techniques. Effects of thermal conduction, contact resistance and 
convection are included. A comparison is made between the model's tem- 
perature predictions and test data. Agreement within 15% is achieved, 
demonstrating that the chips in the MCU will operate well below their 
specification limit of 85° C. 
Nomenclature

A        area (m²)
B        coefficient
C        coefficient
H        contact microhardness (MPa)
h        heat conductance (W/m²-K)
k        thermal conductivity (W/m-K)
k_s      harmonic mean thermal conductivity (W/m-K)
L        half the width of a chip (m)
m        mean absolute asperity slope
M        gas parameter
Nu       Nusselt number
P        contact pressure (MPa)
Pr       Prandtl number
Q        heat flow rate (W)
R        thermal resistance (°C/W)
r        radial distance from bolt center (m)
Re       Reynolds number
T        temperature (°C)
T_j      average chip junction temperature (°C)
V        air flow rate (m³/s)
x        distance (m)
Y        distance between contacting planes (m)
σ        effective RMS surface roughness (m)

Subscripts

i        chip or circuit branch i
j        chip junction
bp       baseplate
chip     chip
epoxy    epoxy
hs       heat sink
int      baseplate-heat sink interface
air      ambient air
1. Introduction

The VAX 9000™ computer extends Digital's style of computing into the mainframe domain.
It is a highly reliable computer with five times the performance of previous VAX™ systems. 
This is achieved, in part, by packaging several Multi Chip Units (MCUs) in close proximity to 
each other. These MCUs generate large amounts of heat, yet are cooled efficiently with air. 
Designing MCUs which maintain low chip temperatures, leading to higher system reliability, 
was a major goal of the program. A thermal model which aided in the design process was essen- 
tial. 

Modeling Multi Chip Units is generally more complicated than modeling single chip 
packages. This is due to three major differences: 1) Multiple heat sources mean that many of 
the heat paths are in parallel, leading to more complicated thermal networks. 2) A chip can 
influence its neighbor, particularly if some are operating intermittently. 3) Within one computer
there are many MCUs, each with different chips in different patterns. A thermal model must be 
robust enough to account for these complexities and variations. In many cases, the odd 
geometries and boundary conditions in MCUs are too complex to model analytically. Layout 
variations and complexities may make numerical grids large and solutions impractical, par- 
ticularly when design changes are frequent, requiring repeated solutions. 

The thermal model must serve the needs of packaging engineers concerned with electrical and 
mechanical performance, reliability engineers concerned with maintaining a cool, reliable sys- 
tem, and manufacturing inspectors who must verify the thermal performance of manufactured 
units. When the model must be used repeatedly by a large community, a familiar and easily used 
format, such as a spreadsheet, is desirable.

In many cases applying a one-dimensional network of thermal resistances to the MCU meets 
these needs. Each component in the thermal path can be assigned a resistance. Accurately 
defining that resistance then becomes the challenging part of the model. In some cases, direct 
analytical relationships can be used. Numerical results that are easily scaled to changes in chip 
types or powers may also be useful. Finally, experimental results lead to empirical definitions 
which fine tune the resistance values. 

The VAX 9000 MCU thermal model comprises resistances determined using all of these
techniques. The chip and epoxy are analytically described using Fourier's law. Numerical 
results are combined with experimental results to describe the baseplate and contact resistance 
between it and the heat sink. Test results define the performance of the heat sink empirically. 
The model is easy to use and helped make the VAX 9000 a very fast computer with chips operat- 
ing well below the specified operating limit of 85°C. 

This paper explains how the VAX 9000 MCU thermal model was created. Examples of cal- 
culations and test results are given with sufficient detail that the approach can be understood and 
applied to other situations. 
2. Hardware Description

The VAX 9000 MCUs can accommodate various configurations of gate array, RAM and cus- 
tom chips. The chips are surrounded by the High Density Signal Carrier (HDSC), a complex 
polyimide and copper structure which supplies signals and power. Connections from the chips to 
this carrier are via Tape Automated Bonding (TAB). The outer leads of the TAB are soldered to 
the HDSC. Both the HDSC and the chips are supported by a copper baseplate. The HDSC, 
which generates no significant heat, is simply laminated to the copper baseplate. Each chip is 
epoxied to the baseplate through cutouts in the HDSC. The baseplate serves as the assembly 
foundation for the electrical components. An air-cooled heat sink is attached to the opposite side 
of the baseplate with a regular array of nine cap screws. See figure 1. 
Figure 1: An exploded side view of the MCU.

2.1. Silicon Chips

There are four sizes of silicon chips in the VAX 9000. The gate array chips are 9.8 x 9.8mm. 
The custom clock chip is 6.4 x 6.4mm. Some MCUs require RAM chips that are either 3.6 x 
4.9mm or 4.2 x 6.3mm. All are roughly 0.5mm thick. These emitter-coupled logic (ECL) 
devices have high power densities: A gate array may consume 30 watts (30 W/cm²), while a
small RAM consumes up to 2 watts (11 W/cm²).
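
The quoted power densities follow directly from the chip footprints given above. The short check below is only an illustrative sketch; the function name is hypothetical, and the inputs are the dimensions and powers stated in the text.

```python
def power_density_w_per_cm2(power_w, width_mm, length_mm):
    """Return the average heat flux in W/cm^2 for a rectangular chip."""
    area_cm2 = (width_mm / 10.0) * (length_mm / 10.0)
    return power_w / area_cm2


# Gate array: 30 W on a 9.8 x 9.8 mm chip.
gate_array_flux = power_density_w_per_cm2(30.0, 9.8, 9.8)
# Small RAM: 2 W on a 3.6 x 4.9 mm chip.
small_ram_flux = power_density_w_per_cm2(2.0, 3.6, 4.9)

print(f"gate array: {gate_array_flux:.1f} W/cm^2")
print(f"small RAM:  {small_ram_flux:.1f} W/cm^2")
```

Both values round to the figures quoted in the text (roughly 30 and 11 W/cm²).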

2.2. Epoxy 

The bottom sides of the chips are epoxied to the top of the copper baseplate. Several attributes 
are required of this epoxy. Electrically, it must be an insulator. Thermally, it must be an excel- 
lent conductor. The epoxy must be resilient because the thermal expansion coefficient of the 
copper baseplate is much higher than that of the silicon. Therefore, a high percentage of elon- 
gation is needed in the epoxy. Finally, it must perform consistently in the manufacturing 
process, yielding thin, strong, continuous bond lines. 

The resin system used has a very high percentage of elongation. To obtain a high thermal 
conductivity and an electrically insulating joint, fine diamond particles are added to the resin. 
By keeping the bond line thickness between 0.025 and 0.05mm, the thermal performance is ac- 
ceptable, and the structural integrity of the joint is assured. 

2.3. Baseplate 

The structural foundation for the MCU is the baseplate. The HDSC is laminated to this flat, 
rigid, 10 x 10 x 0.9cm thick plate. The chips are bonded to the plate through the cutouts in the 
HDSC. 

The baseplate must have a high thermal conductivity so that the heat introduced at the epoxy 
joint can spread effectively through its thickness. Nine blind tapped holes on 3 cm centers in the 
bottom of the baseplate allow the heat sink to be attached with #10-32 cap screws. Chrome- 
copper alloy (CDA 182) was found to meet these requirements. To reduce scratching during 
manufacturing, an electroless nickel plating is applied. 

2.4. Heat Sink 

A goal of the VAX 9000 design was to use quiet, low velocity air to cool the MCUs. This was 
accomplished by utilizing a pin fin heat sink for each MCU. 

This heat sink is made by pressing cylindrical pins into an aluminum base. The 600 staggered 
pins increase the heat transfer surface area by 8:1 over that of the flat base alone. Various 
designs of the pin fin heat sink were considered, including copper and aluminum pins of dif- 
ferent diameters and pressing the pins into the base versus dip brazing or casting them. 

Air is supplied perpendicularly to the center of the heat sink, from a 5cm diameter nozzle. The 
flow behaves like uniformly-approaching flow or wedge flow, since the nozzle is large compared 
to the heat sink. As the air enters the center of the fin array and hits the base of the heat sink, it 
axisymmetrically turns and passes across the fins in a radial direction. 
3. Thermal Model

The thermal network for the MCU consists of conduction and convection paths. Heat 
generated on the top surface of the chips passes through the silicon and then through the epoxy.
From the epoxy, the heat spreads into the baseplate, through the bolted joint and into the heat 
sink where it is liberated to the passing air. A very small amount of heat passes out of the chip 
through the TAB leads and is not included in the model. The thermal network for the MCU may 
be depicted as in figure 2. 
Figure 2: Thermal resistance network for the MCU
with up to n various chips. 

Note that portions of the network are in parallel. Since an MCU includes chips of various 
powers and cross-sectional areas, parallel paths are needed to adequately describe the system. 
The temperature at the bottom of the baseplate is assumed to be isothermal. Experimental results 
support this assumption. Thermal coupling from chip to chip, within the baseplate, is not 
handled explicitly. This would require that the model be multi-dimensional and possibly solved 
iteratively. This coupling is discussed in the Baseplate section below. From the bottom of the 
baseplate to the air, the network is based on the total MCU area and power. A boundary con- 
dition on the system is the ambient air temperature. Note that the heat fluxes are constant, but 
not equal, through the various branches of the network. Each resistance in this network is dis- 
cussed below. 
3.1. Silicon Thin Film Heater Chips

In order to estimate the thermal performance of the MCUs early in the program, silicon thin 
film resistance heaters were used to simulate the real chips. These heater chips were produced to 
the same dimensions as the real chips. A thin film of NiCr across the top of the silicon provided 
a uniform resistance. These heater chips allowed us to examine various MCU configurations and 
verify the thermal model. The following results and discussions are based on experiments with 
such heater chips. 

A simple model describes the temperature drop through a chip by Fourier's law in one dimen- 
sion. This description requires that the heat flux be assumed uniform. The expression below 
provides the average thermal resistance through a chip with a given thickness, conductivity and 
area. 

$$R_{chip} = \frac{\Delta x_{chip}}{k_{chip}\, A_{chip}} \tag{1}$$


3.2. Epoxy 

Fourier's law is also used to describe the resistance through the epoxy as 

$$R_{epoxy} = \frac{\Delta x_{epoxy}}{k_{epoxy}\, A_{epoxy}} \tag{2}$$

The thermal conductivity of this new diamond-filled epoxy was not known. Tests were designed 
to determine this essential property in geometries similar to the real epoxy joint. Experiments 
produced a value of k = 1.4 W/m-K with an experimental uncertainty of ± 10%.
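
As a rough illustration of the one-dimensional Fourier's law resistance R = Δx/(kA) used for both the chip and the epoxy, the sketch below evaluates it for a gate array. The chip dimensions, bond line thickness range and epoxy conductivity come from the text; the silicon conductivity of 150 W/m-K is an assumed typical value that the paper does not state, and the function name is hypothetical.

```python
def conduction_resistance(thickness_m, conductivity_w_mk, area_m2):
    """One-dimensional conduction resistance R = dx / (k * A), in C/W."""
    return thickness_m / (conductivity_w_mk * area_m2)


GATE_ARRAY_AREA_M2 = 9.8e-3 * 9.8e-3   # 9.8 x 9.8 mm chip (from the text)
K_SILICON = 150.0                      # W/m-K, ASSUMED typical value
K_EPOXY = 1.4                          # W/m-K, measured value quoted in the text

r_chip = conduction_resistance(0.5e-3, K_SILICON, GATE_ARRAY_AREA_M2)
# Bond line limits of 0.025 mm and 0.05 mm bracket the epoxy resistance.
r_epoxy_thin = conduction_resistance(0.025e-3, K_EPOXY, GATE_ARRAY_AREA_M2)
r_epoxy_thick = conduction_resistance(0.05e-3, K_EPOXY, GATE_ARRAY_AREA_M2)

print(f"R_chip  = {r_chip:.3f} C/W")
print(f"R_epoxy = {r_epoxy_thin:.3f} to {r_epoxy_thick:.3f} C/W")
```

Under these assumptions the thin epoxy layer has a resistance several times that of the much thicker silicon, which is why the bond line thickness is controlled so tightly.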



3.3. Baseplate 

Analytical, numerical and experimental techniques were used to understand the thermal per- 
formance of the baseplate. Since this was the element that coupled all the chips to the heat sink, 
it received a great deal of attention. Analytical techniques were initially attempted using the 
results of Kennedy. [3] Bolt holes in the baseplate and a non-uniform contact pressure distribu- 
tion made this solution difficult to use, and results were found to be unreliable. 

Numerical models were used to study one symmetrical quadrant of a chip and the components 
between it and the cooling air. Both axisymmetrical (SINDA) and cartesian (ANSYS®) grid 
geometries were investigated. This method proved very useful, as many baseplate thicknesses, 
materials and bolt hole sizes could be evaluated to determine optimum combinations. The 
boundary conditions at the bottom of the baseplate required particular attention, and are dis- 
cussed in the Interface section. 

An experiment was run to determine how well the numerical models described the tempera- 
ture field within the baseplate. Temperatures within the baseplate were measured by probing 
0.5mm diameter holes with thermocouples. This provided plots of temperature versus radial dis- 
tance from the center of the chip, in various planes within the thickness of the baseplate. A 
comparison of two numerical results and the temperature measurements within a baseplate with
nine 30 watt chips on it is shown in figure 3. The reasonable agreement between the experimental
and numerical results permitted us to use the numerical model to determine the temperature
distribution within the baseplate.

The network in figure 2 does not account for thermal coupling from chip to chip within the 
baseplate. At the cost of much greater complexity, the accuracy of this model could be improved 
by replacing the one-dimensional baseplate resistances with a mesh of inter-connected resistances.
However, our simpler model may be justified by noting that already at r/L = 3 (1.5 chip
widths away from the center of the chip) the temperature rise plotted in figure 3 approaches zero.
This suggests the spacing at
which the heat flux from one chip begins to encroach upon that of its neighbor. All gate arrays
are spaced at least three chip widths apart, reducing their thermal influence on each other. RAM 
chips are spaced somewhat closer together, but their power per unit area of baseplate is lower. 
Figure 3: Baseplate temperature profiles in a plane located
1.3mm below the top of the baseplate.
T_ref is a temperature at the bottom of the baseplate.
L = 5mm = one half of this heater chip's width.



Temperature profiles such as those in figure 3 are different for chips with various sizes and 
powers. Obtaining these specific curves for each unique chip layout would require hundreds of 
numerical models. In order to have an empirical model that easily fits the spreadsheet
paradigm, the temperature drop through the baseplate was converted into an apparent heat con- 
ductance for that portion of the baseplate. 
$$h_{bp,i} = \frac{Q_i}{\Delta T_{bp,i}\, A_{bp,i}} \tag{3}$$

where A_bp,i is each chip's allotted portion of the baseplate. For example, for a symmetrical
layout of nine chips on an MCU, A_bp,i = 1/9 of the total baseplate area. Or for a more complex
case, if an MCU were made up of one tight cluster of six RAMs and seven separate custom
chips, then the entire RAM array might be assigned 1/8 of the baseplate's total area. Each of the
six RAMs could then be allotted 1/6 of the RAM array area or 1/48 of the entire baseplate area. This
approximation becomes less valid for highly asymmetrical chip layouts. A_bp,i is graphically
determined from MCU drawings. The ΔT_bp,i used in equation 3 was computed and measured
for a particular, but representative, chip layout. The thermal resistance through the baseplate 
may now be approximated by assuming that the local heat conductance is the same throughout 
the baseplate. Under any particular chip, 

$$h_{bp} = h_{bp,i} \tag{4}$$

which yields the following:

$$R_{bp,i} = \frac{1}{h_{bp}\, A_{bp,i}} \tag{5}$$

While it is clear that these approximations can limit the accuracy of the model, particularly for 
unusual chip layouts, the simplicity makes the model accessible to a large user community. 
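
The baseplate procedure can be sketched in two steps: derive an apparent conductance from one representative layout, then reuse it with each chip's allotted area. The reference temperature drop of 9 C and the 1/48 area fraction below are hypothetical illustration values, not data from the paper; the function names are likewise assumptions.

```python
def apparent_conductance(q_ref_w, delta_t_ref_c, area_ref_m2):
    """Apparent baseplate conductance h = Q / (dT * A), in W/m^2-K."""
    return q_ref_w / (delta_t_ref_c * area_ref_m2)


def baseplate_resistance(h_bp, allotted_area_m2):
    """Baseplate resistance for one chip, R = 1 / (h * A), in C/W."""
    return 1.0 / (h_bp * allotted_area_m2)


BASEPLATE_AREA_M2 = 0.10 * 0.10        # 10 x 10 cm plate (from the text)

# Reference layout: nine 30 W chips, each allotted 1/9 of the plate.
# The 9 C baseplate temperature drop is a HYPOTHETICAL value.
h_bp = apparent_conductance(30.0, 9.0, BASEPLATE_AREA_M2 / 9.0)

# Reuse the same conductance for chips with different allotted areas.
r_bp_gate_array = baseplate_resistance(h_bp, BASEPLATE_AREA_M2 / 9.0)
r_bp_small_ram = baseplate_resistance(h_bp, BASEPLATE_AREA_M2 / 48.0)

print(f"h_bp = {h_bp:.0f} W/m^2-K")
print(f"R_bp (1/9 of plate)  = {r_bp_gate_array:.2f} C/W")
print(f"R_bp (1/48 of plate) = {r_bp_small_ram:.2f} C/W")
```

The smaller a chip's allotted area, the larger its baseplate resistance, which is the only layout-dependent input this part of the model needs.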



3.4. Interface 

A contact resistance exists between the baseplate and the heat sink. This resistance can be 
very significant depending on materials, the condition of the surfaces and the contact pressure. 
A detailed understanding of these parameters was needed to design an effective, reliable joint. 

Manufacturing and performance requirements constrained the material selection for the 
baseplate (Cr-Cu) and heat sink (Al). The nickel plating on the copper baseplate increases the 
thermal resistance of the joint, since it increases the surface hardness; however, the benefit of
scratch resistance outweighs the thermal cost. Experimentally it was found that a uniform sur- 
face finish between 0.2 - 0.4 micron RMS minimized the contact resistance. Finishes outside 
this range increased the resistance. 

The joint contact pressure can be controlled by choosing an appropriate torque for the cap 
screws that attach the heat sink. However, the pressure distribution in the interface is not 
uniform, and depends on variables such as the bolt locations and sizes, plate materials and thick- 
nesses. 

To properly estimate the contact resistance in this joint, the pressure distribution was inves- 
tigated. Mikic and Gould have described a method to determine the contact area between two 
thin plates that are bolted together at the center. [5] Analysts at Digital built a similar numerical 
model. This numerical scheme was used to predict the interface pressure versus distance, r, from 
the bolt. It is interesting to note that the model suggests that the contact pressure drops off to 
zero at roughly three bolt radii from the center of the bolt. The local pressure was then used in a 
contact conductance expression suggested by Yovanovich. [6] This expression defines the conductance
through the solid-to-solid contacts and the air gaps, which are much less significant in
our case.

$$h(r) = 1.25\, k_s\, \frac{m}{\sigma} \left(\frac{P(r)}{H}\right)^{0.95} + \frac{k_{air}}{Y + M} \tag{6}$$

$$m = \sqrt{m_{bp}^2 + m_{hs}^2} \tag{7}$$

$$k_s = \frac{2\, k_{bp}\, k_{hs}}{k_{bp} + k_{hs}} \tag{8}$$

$$\sigma = \sqrt{\sigma_{bp}^2 + \sigma_{hs}^2} \tag{9}$$

$$Y = 1.53\, \sigma \left(\frac{P}{H}\right)^{-0.097} \tag{10}$$

where m is the mean absolute asperity slope, k_s is the harmonic mean thermal conductivity of the
two contacting solids, k_air is the thermal conductivity of the air in the gaps, σ is the effective
RMS surface roughness, H is the contact microhardness
of the softer material, Y is the distance between the mean planes of the contacting rough surfaces
[1], and M is a gas parameter which is dependent upon surface characteristics and thermodynamic
properties. [6] Equation 6 yields a local heat conductance, which was used in the
baseplate numerical model, or may be integrated around the bolt for an average value.
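
A minimal sketch of the local joint conductance is given below, assuming the usual form of this correlation: a solid-contact term proportional to (P/H)^0.95 plus a gas-gap term k_air/(Y + M). Every property value in the example (conductivities, asperity slopes, hardness, pressures and the gas parameter) is an illustrative assumption rather than data from the paper, and the function names are hypothetical.

```python
import math


def harmonic_mean_conductivity(k_a, k_b):
    """Harmonic mean conductivity k_s of two contacting solids (W/m-K)."""
    return 2.0 * k_a * k_b / (k_a + k_b)


def root_sum_square(a, b):
    """Combine two surface slopes or roughnesses in quadrature."""
    return math.hypot(a, b)


def local_conductance(pressure_mpa, hardness_mpa, k_s, slope, sigma_m,
                      k_gas=0.026, gas_parameter_m=1.0e-7):
    """Local joint conductance (W/m^2-K): solid contacts plus gas gap."""
    if pressure_mpa <= 0.0:
        raise ValueError("contact pressure must be positive")
    relative_pressure = pressure_mpa / hardness_mpa
    solid = 1.25 * k_s * (slope / sigma_m) * relative_pressure ** 0.95
    # Separation of the mean planes shrinks slowly as pressure rises.
    gap_m = 1.53 * sigma_m * relative_pressure ** -0.097
    gas = k_gas / (gap_m + gas_parameter_m)
    return solid + gas


# All values below are ILLUSTRATIVE ASSUMPTIONS, not data from the paper.
k_s = harmonic_mean_conductivity(320.0, 200.0)  # Cr-Cu and Al, W/m-K
slope = root_sum_square(0.05, 0.05)             # asperity slopes
sigma = root_sum_square(0.3e-6, 0.3e-6)         # 0.3 micron RMS per surface
for pressure in (1.0, 5.0, 10.0):               # contact pressure, MPa
    h = local_conductance(pressure, 1000.0, k_s, slope, sigma)
    print(f"P = {pressure:4.1f} MPa -> h = {h:,.0f} W/m^2-K")
```

Because the solid term scales almost linearly with pressure, the local conductance falls quickly with distance from the bolt, where the contact pressure drops toward zero.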

Rearranging the average heat conductance yields an average thermal resistance for the inter- 
face. 



$$R_{int} = \frac{1}{\bar{h}\, A_{int}} \tag{11}$$

Equation 6 was used to help make decisions about various design options. Once a design was 
chosen, experiments were performed to evaluate the interface performance. Figure 4 shows 
results that were easily reproduced, even when parts were mixed and matched. At low contact 
pressures (low bolt torques) the resistance becomes erratic and increases toward a limit described 
by conduction through an air gap. At high contact pressures all significant plastic deformations 
in the surface asperities have taken place, maximizing the solid-to-solid contact area, thereby
minimizing the thermal resistance. Based on thread strength in the baseplate, a bolt torque of 4
N-m was specified for the MCU assembly. R_int = 0.01 °C/W was chosen for use in the model
since it is a reasonable upper limit given normal process variations. 
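
To give a feel for what an interface resistance of 0.01 C/W means, the sketch below converts it back to an average conductance and to a temperature drop at a typical MCU power. It assumes the interface area equals the full 10 x 10 cm baseplate footprint, which ignores the bolt holes; that area and the function names are assumptions made for illustration.

```python
R_INT_C_PER_W = 0.01               # interface resistance chosen in the text
INTERFACE_AREA_M2 = 0.10 * 0.10    # ASSUMED: full baseplate footprint


def average_conductance(resistance_c_per_w, area_m2):
    """Average conductance implied by R = 1 / (h * A), in W/m^2-K."""
    return 1.0 / (resistance_c_per_w * area_m2)


def temperature_drop(resistance_c_per_w, power_w):
    """Temperature drop across a resistance carrying a given heat flow."""
    return resistance_c_per_w * power_w


h_avg = average_conductance(R_INT_C_PER_W, INTERFACE_AREA_M2)
drop_200w = temperature_drop(R_INT_C_PER_W, 200.0)

print(f"average conductance: {h_avg:.0f} W/m^2-K")
print(f"interface drop at 200 W: {drop_200w:.1f} C")
```

At 200 W the interface contributes only about 2 C under these assumptions.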



3.5. Heat Sink 

Analytical modeling of the heat sink proved to be very difficult. Air flow approaching per- 
pendicularly to a flat heat sink without fins can be described by wedge flow solutions [2], but the 
addition of pin fins in the flow field disqualifies direct use of this type of solution. At best, 
closed form solutions reveal trends which aid in choosing fin characteristics. Empirical expres- 
sions for the heat sink performance were the only reasonable methods to use in our model. The 
design of the heat sink was strongly driven by manufacturability and cost considerations. Once a 
design was chosen, tests were run on the heat sink to examine nozzle designs, pressure drops, 
thermal resistances and noise at various flow rates. These data were then fitted to an ap- 
proximate theoretical correlation. 
Figure 4: Thermal contact resistance between the baseplate
and the heat sink.

Theoretically, flow over banks of cylinders may be described by the following relationship, 

$$Nu = B\, Pr^{0.33}\, Re^{C} \tag{12}$$

where B and C are constants based upon the cylinder size and row and column layout. [4] The 
VAX 9000 heat sink may be described as a bank of cylinders with air passing through the field 
radially. For our geometry, values for the constants might be expected to be in the neighborhood 
of B = 0.5 and C = 0.57. There are two major difficulties in directly applying this kind of model.
One is that since the flow of air is spreading out radially, the Reynolds number decreases with 
distance from the center of the heat sink. The second is that our fin layout is staggered dif- 
ferently from the available tube bank correlations. In spite of this, the experimental data of 
figure 5 had a good fit to the correlation in equation 12. 



Noting that

$$R \sim \frac{1}{Nu} \tag{13}$$

and the volumetric flow rate is proportional to the Reynolds number,

$$V \sim Re \tag{14}$$

our data fits the following empirical expression.

$$R_{hs} = 0.012\, Pr^{-0.33}\, V^{-0.55} \tag{15}$$
Equation 15 provides an adequate prediction of R_hs for any flow rate of interest in this application.
The change in the enthalpy of the air is included in the resistance of the heat sink. A single
5cm diameter nozzle per MCU produced the best performance and was used for the data of 
figure 5. Other designs tested consisted of single or multiple smaller nozzles. 
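
The empirical heat sink expression is easy to evaluate for different flow rates. The sketch below assumes the fitted form R_hs = 0.012 Pr^-0.33 V^-0.55 with V in m³/s (the unit listed in the nomenclature) and a Prandtl number of 0.71 for air; the flow rates in the loop are hypothetical, and the function name is an assumption.

```python
PRANDTL_AIR = 0.71   # ASSUMED typical value for air near room temperature


def heat_sink_resistance(flow_m3_per_s, prandtl=PRANDTL_AIR):
    """Empirical heat sink resistance (C/W) as a function of air flow rate."""
    if flow_m3_per_s <= 0.0:
        raise ValueError("flow rate must be positive")
    return 0.012 * prandtl ** -0.33 * flow_m3_per_s ** -0.55


# Hypothetical flow rates, chosen only to show the trend.
for flow in (0.01, 0.02, 0.04):
    print(f"V = {flow:.2f} m^3/s -> R_hs = {heat_sink_resistance(flow):.3f} C/W")

# Doubling the flow scales the resistance by 2**-0.55, about a 32% reduction.
doubling_ratio = heat_sink_resistance(0.04) / heat_sink_resistance(0.02)
```

The weak exponent means that large increases in air flow, and therefore noise, buy only modest reductions in heat sink resistance.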
Figure 5: Thermal resistance of a heat sink with aluminum fins.



4. Test Results and Model Comparisons 

By assembling the resistances described above, a model can be built for any particular MCU. 
The model can handle any of the four chip types in various layouts. A spreadsheet was used to
handle this task, making all of the calculations quick and easy for a large community, including 
those without thermal engineering experience. For any particular chip, i, the predicted junction 
temperature is 

$$T_{j,i} = \left(R_{chip,i} + R_{epoxy,i} + R_{bp,i}\right) Q_i + \left(R_{int} + R_{hs}\right) Q + T_{air} \tag{16}$$

with an uncertainty of ± 15%, where Q is the total MCU power.
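
The same calculation a spreadsheet would perform can be sketched in a few lines: each chip carries its own power through its own series branch, and the total MCU power passes through the shared interface and heat sink. The class name, function name and all resistance and power values below are hypothetical illustration values, not figures from the paper.

```python
from dataclasses import dataclass


@dataclass
class ChipBranch:
    """Per-chip branch of the network; resistances in C/W, power in W."""
    name: str
    power_w: float
    r_chip: float
    r_epoxy: float
    r_bp: float


def junction_temperatures(chips, r_int, r_hs, t_air_c):
    """Predict the average junction temperature of every chip on one MCU."""
    total_power_w = sum(chip.power_w for chip in chips)
    # The interface and heat sink carry the heat from all chips together.
    shared_rise_c = (r_int + r_hs) * total_power_w
    return {
        chip.name: (chip.r_chip + chip.r_epoxy + chip.r_bp) * chip.power_w
        + shared_rise_c + t_air_c
        for chip in chips
    }


# HYPOTHETICAL two-chip MCU used only to exercise the function.
chips = [
    ChipBranch("gate_array", 30.0, 0.035, 0.28, 0.30),
    ChipBranch("ram", 2.0, 0.20, 1.50, 1.60),
]
temps = junction_temperatures(chips, r_int=0.01, r_hs=0.125, t_air_c=25.0)
for name, temp_c in temps.items():
    print(f"{name}: {temp_c:.2f} C")
```

Because the shared term depends on the total power, adding or removing any chip shifts the predicted temperature of every other chip on the MCU.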

To verify the model's accuracy, the silicon heater chips were assembled to baseplates in place 
of real chips. Several MCU patterns with all four heater chip types were built using normal 
manufacturing equipment and processes. While these MCUs were powered, an infra-red camera 
was used to determine the average temperatures on the surface of the chips. The uncertainty for
the temperature measurement was ± 4 C. These measured temperatures were then compared to 
those predicted by the thermal model. Figure 6 shows good agreement between the model and
experiment. As expected, the hottest devices are the gate arrays (figure 6, group A), while the
RAMs run cooler. The data in figure 6 are from many different MCUs with various chip patterns
and total powers ranging from 134 to 220 watts. It is important to note that the same type
of chip on two different MCUs may not run at the same temperature. Some low heat
flux RAMs (figure 6, group B) ran hotter than high heat flux RAMs (figure 6, group C). This
occurred when the low heat flux RAMs were on an MCU with high total power, and vice versa.
A particular chip's operating temperature is directly affected by the total MCU power, as seen in
the thermal network, figure 2. Herein lies a complication with modeling a multi chip unit
one-dimensionally.

Figure 6: Average surface temperatures of silicon heater chips.
(Horizontal axis: average Tj measured, C.)

It is also interesting to note the percentages of the total temperature rise that each component 
in the thermal network accounts for. For a typical MCU generating 200 watts, the chip and 
contact interface account for less than 5% each. The epoxy and baseplate each account for 15 to 
20%. The heat sink has the largest impact on the chip temperature, contributing more than 50% 
of the total temperature rise. 
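
A breakdown of this kind can be reproduced by dividing each component's temperature rise by the total. The resistances below are hypothetical values for a 30 W gate array on a 200 W MCU, chosen only so that the resulting shares fall in the ranges quoted above; they are not the paper's data, and the function name is an assumption.

```python
def rise_shares(branch_resistances, chip_power_w,
                shared_resistances, total_power_w):
    """Return each component's percentage of the junction-to-air rise."""
    rises = {name: r * chip_power_w
             for name, r in branch_resistances.items()}
    rises.update({name: r * total_power_w
                  for name, r in shared_resistances.items()})
    total_rise = sum(rises.values())
    return {name: 100.0 * rise / total_rise for name, rise in rises.items()}


# HYPOTHETICAL resistances (C/W) for a 30 W gate array on a 200 W MCU.
shares = rise_shares(
    branch_resistances={"chip": 0.035, "epoxy": 0.28, "baseplate": 0.30},
    chip_power_w=30.0,
    shared_resistances={"interface": 0.01, "heat sink": 0.125},
    total_power_w=200.0,
)
for name, percent in shares.items():
    print(f"{name:10s} {percent:5.1f} %")
```

The heat sink dominates because it carries the full MCU power, while the per-chip branches carry only one chip's share.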
5. Conclusions

For the VAX 9000, the many MCU chip layouts demanded a simple modeling approach. A 
thermal network composed of resistances representing major components in the MCU was 
developed. A combination of analytical, numerical and experimental approaches was used to
derive the values of these thermal resistances. The measured temperatures of heater chips are 
within 15% of the model's predictions, indicating that chips on the VAX 9000 should operate 
well below the specified limit of 85°C. 

Although this specific thermal model cannot be used directly on other multi chip module
designs, it suggests that this approach to modeling can have universal appeal to a large user 
community and can be done quite simply and effectively. 